Re: [SQL] Beginner Join question - Mailing list pgsql-sql

From Herouth Maoz
Subject Re: [SQL] Beginner Join question
Date
Msg-id l03110706b1ac40b852be@[147.233.159.109]
Whole thread Raw
In response to Beginner Join question  (calvin@improtech.co.za (Calvin Browne))
List pgsql-sql
At 18:17 +0300 on 16/6/98, Calvin Browne wrote:


> TABLE A (   TABLE B(
>   record_id   record_id
>   field1      field3
>   field2)     field4)
>
> record_id in both tables are the same.
> The record definitely exists in TABLE A, but not
> necessarily in TABLE B.
>
> I need to get the following:
>
> record_id | A.field1 | A.field2 | B.field3
> -------------------------------------------------------------------
> rec1      | xxxx     | xxxxxx   | Field displayed if there is
>           |          |          | a record in B, such that A.record_id
>           |          |          | is equal to B.record.id, otherwise
>           |          |          | leave it blank.
> -------------------------------------------------------------------
>
> I'm going to have about a million records in each table, so
> efficiency of the join is a priority. Disk space is not.

Seems you need an outer join. Not supported yet in PostgreSQL. You have one
of two options:

Make a union between a "normal" join and a NOT EXISTS query.

Or

Define an SQL function that returns the value of field3 based on given
record_id. I like this one better, because I think the union in the first
solution, in addition to the NOT EXISTS query, may make the overhead of the
function negligible.

However, if you need to have more than just one field in the query, you'll
have to define two functions, and since each of them starts its own query
whenever invoked, efficiency will deteriorate.

So:

-- Given the following two tables:

testing=> SELECT * FROM example1;
record_id|field1|field2
---------+------+------
        1|    10|   100
        2|    20|   200
        3|    30|   300
        4|    40|   400
        5|    50|   500
(5 rows)

testing=> SELECT * FROM example2;
record_id|field3|field4
---------+------+------
        1|  1000| 10000
        3|  3000| 30000
        5|  5000| 50000
(3 rows)

-- We create the following function

testing=> CREATE FUNCTION ex2_fld3( int4 ) RETURNS int4
testing-> AS 'SELECT field3 FROM example2 WHERE record_id = $1'
testing-> LANGUAGE 'sql';
CREATE

-- And here is your query:

testing=> SELECT record_id, field1, field2, ex2_fld3( record_id )
testing-> FROM example1;
record_id|field1|field2|ex2_fld3
---------+------+------+--------
        1|    10|   100|    1000
        2|    20|   200|
        3|    30|   300|    3000
        4|    40|   400|
        5|    50|   500|    5000
(5 rows)

Hope this is efficient enough for you. Don't forget to create an index on
record_id in your TABLE B.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma



pgsql-sql by date:

Previous
From: calvin@improtech.co.za (Calvin Browne)
Date:
Subject: Beginner Join question
Next
From: Sebastien Flacher
Date:
Subject: troubles with refint.c